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Abstract 

Background: Estrogens regulate diverse physiological processes in various tissues through genomic and non- 
genomic mechanisms that result in activation or repression of gene expression. Transcription regulation upon 
estrogen stimulation is a critical biological process underlying the onset and progress of the majority of breast 
cancer. Dynamic gene expression changes have been shown to characterize the breast cancer cell response to 
estrogens, the every molecular mechanism of which is still not well understood. 

Results: We developed a modulated empirical Bayes model, and constructed a novel topological and temporal 
transcription factor (TF) regulatory network in MCF7 breast cancer cell line upon stimulation by 17p-estradiol 
stimulation. In the network, significant TF genomic hubs were identified including ER-alpha and AP-1; significant 
non-genomic hubs include ZFP161, TFDP1, NRF1, TFAP2A, EGR1, E2F1, and PITX2. Although the early and late 
networks were distinct (<5% overlap of ERa target genes between the 4 and 24 h time points), all nine hubs were 
significantly represented in both networks. In MCF7 cells with acquired resistance to tamoxifen, the ERa regulatory 
network was unresponsive to 17p-estradiol stimulation. The significant loss of hormone responsiveness was 
associated with marked epigenomic changes, including hyper- or hypo-methylation of promoter CpG islands and 
repressive histone methylations. 

Conclusions: We identified a number of estrogen regulated target genes and established estrogen-regulated 
network that distinguishes the genomic and non-genomic actions of estrogen receptor. Many gene targets of this 
network were not active anymore in anti-estrogen resistant cell lines, possibly because their DNA methylation and 
histone acetylation patterns have changed. 



Background 

Estrogens regulate diverse physiological processes in 
reproductive tissues and in mammary, cardiovascular, 
bone, liver, and brain tissues [1]. The most potent and 
dominant estrogen in human is 17P -estradiol (E2). The 
biological effects of estrogens are mediated primarily 
through estrogen receptors a and P (ER-a and -P), 
ligand-inducible transcription factors of the nuclear 
receptor superfamily. Estrogens control multiple 
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functions in hormone-responsive breast cancer cells [2], 
and ERa, in particular, plays a major role in the etiology 
of the disease, serving as a major prognostic marker and 
therapeutic target in breast cancer management [2]. 

Binding of hormone to receptor facilitates both geno- 
mic and non-genomic ERa activities to either activate 
or repress gene expression. Target gene regulation by 
ERa is accomplished primarily by four distinct mechan- 
isms (additional file 1) [3-5]: (i) ligand-dependent geno- 
mic action (i.e., direct binding genomic action or 
"DBGA"), in which ERa binds directly to estrogen 
response elements (ERE) in DNA. Candidate DBGA 
gene targets include PR and Bcl-2; (ii) ligand-dependent, 
ERE-independent genomic action (i.e., indirect binding 
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genomic action or "I-DBGA"). In I-DBGA, ERa regu- 
lates genes via protein-protein interactions with other 
transcription factors (such as c-Fos/c-Jun (AP-1), Spl, 
and nuclear factor-^B (NF/cB)) [4]. Target I-DBGA 
genes include MMP-1 and IGFNP4; (iii) Ligand-inde- 
pendent ERa signaling, in which gene activation occurs 
through second messengers downstream of peptide 
growth factor signaling (e.g., EGFR, IGFR, GPCR path- 
ways). Ligand-independent mechanism can be either 
DBGA or I-DBGA. These pathways alter intracellular 
kinase and phosphatase activity, induce alterations in 
ERa phosphorylation, and modify receptor action on 
genomic and non-genomic targets; (iv) rapid, non-geno- 
mic effects through membrane-associated receptors acti- 
vating signal transduction pathways such as MAPK and 
Akt pathways (i.e. non-genomic action, NGA). Note that 
the term, non-genomic effect, is based on the fact that 
estrodial signaling pathway doesn't involve ERa itself 
(additional file 1) and as a consequence there is no 
direct ERa mediated transcription. Furthermore, target 
genes can receive input from multiple estrogen actions, 
e.g., cyclin Dl is a target of multiple transcription fac- 
tors (TF): SP1, API, STAT5, and NF^B [3]. These four 
complex regulatory mechanisms, which describe the dis- 
tribution of ERa and co-regulators in the nucleus and 
membrane signal transduction proteins, are called topo- 
logical mechanisms and instrumental in sustaining 
breast cancer growth and progression. 

Dynamic gene expression changes characterize the 
breast cancer cell response to estrogens, and the kinetics 
of ERa target genes are strongly influenced by the hor- 
mone treatment times. Early work by Inoue et al. [6] 
revealed distinct gene clusters that correspond to either 
early or late E2-responsive genes. Frasor and co-workers 
[7] defined "early" responsive targets in MCF7 cells as 
genes up- or down-regulated by 8 h after E2 treatment; 
genes induced by 24 h post E2 treatment were classified 
as "late" responders and can be blocked by the protein 
translation inhibitor cycloheximide. It was further 
demonstrated that cyclin Dl expression was mediated 
by the interaction of ERa-Spl (early response) and by 
MAPK-activated EIk-2 and SRF [3] (later response). As 
ERa binding sites are more significantly associated with 
E2 up-regulated rather than down-regulated genes [8], 
Carroll et al. hypothesized that physiologic squelching is 
a primary cause of early down-regulation and late 
down-regulation is an ERa -mediated event. Collectively, 
these studies and many others [9] strongly support a 
temporal mechanism of ERa regulation. 

A number of gene regulatory network models have 
been developed to integrate ChlP-chip and gene expres- 
sion data, including genetic regulatory module algorithm 
(GRAM) [10], statistical analysis of network dynamics 



(SANDY) [11], Bayesian error analysis model (BEAM) 
[12], and two-stage constrained space factor analyses 
[13-15]. Although a unified model framework was used 
to establish regulatory networks, those computational 
approaches were not capable of distinguishing genomic 
and non-genomic mechanisms, presumably due to fail- 
ure to account for key differences in the type of data 
corresponding to genomic and non-genomic mechan- 
isms. ERa genomic targets consist of protein binding 
signals (ChlP-chip peaks), which is not the case for 
non-genomic targets, and thus models and regulation 
selection for genomic and non-genomic ERa regulatory 
mechanisms are different. In addition, although the 
above computational approaches join models for ChlP- 
chip and gene expression data, TF motif scans are not 
typically performed, making it difficult to infer ERa 
DBGA or I-DBGA targets from these approaches. 

In this study, we developed a new modulated empiri- 
cal Bayes approach to assemble the ERa regulatory net- 
work. Our approach, for the first time, differentiates 
topological features of ERa regulation mechanisms: 
DBGA, I-DBGA, and NGA. By examining the estrogen- 
responsive gene network in breast cancer cell models, 
we established that the ERa regulatory network changes 
over time. This modulated empirical Bayes model con- 
trols false positives arising from ChlP-chip binding data, 
TF binding site (TFBS) motif scans, and differential 
gene expression profiles. Two applications of this regu- 
latory network were studied. In the first application, the 
agonist/antagonist activities of two active metabolites of 
tamoxifen, 4- OH- tamoxifen and endoxifen, were investi- 
gated. The second application investigated the impact of 
epigenetics (DNA methylation and histone modifica- 
tions) on ERa regulatory network in our previously 
established breast cancer cell model of acquired tamoxi- 
fen resistance [16]. 

Results 

Data analyses overview 

The ERa regulatory network model was developed 
based on differential gene expression data for MCF7 
(untreated, 4 and 24 hour post E2 treatment) [16,17] 
and ERa ChlP-chip data [8]. The antagonistic/agonistic 
effects of OHT and endoxifen on this network were 
assessed using MCF7 gene expression microarray data 
at 24 hour post E2, OHT, endoxifen, E2+OHT, and E2 
+endoxifen treatments [17]. In MCF7 cells with 
acquired resistance to tamoxifen, the response of the 
ERa regulatory network was evaluated using gene 
expression microarray data [16], and the epigenetic 
mechanisms for non-responsive ERa network in MCF7- 
T cells were investigated by H3K4me2 and H3K27me3 
ChlP-seq data and MCIp-seq. 
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ERa regulation mechanisms and ERa targets 

Based on ERa ChlP-chip data and microarray mRNA 
expression profiles after E2 stimulation of MCF7 breast 
cancer cells, we categorized ERa regulatory mechanisms 
into three groups (additional file 2): genomic action 
with ERa direct ERE binding (DBGA), genomic action 
with ERa indirect/ERE-independent (e.g., AP-1) binding 
(I-DBGA), and non-genomic/ligand-independent action 
(NGA). In DBGA, the activation of ERa can be either 
by E2 (ligand-dependent) or growth factor-mediated 
phosphorylation (ligand independent) (additional file 1 
and additional file 2). Our current data is not able to 
distinguish between these two types of mechanisms. 

Different ERa mechanisms and their targets in MCF7 
cell are displayed in Figure 1. For the three ERa 
mechanisms described above, more up-regulated targets 
were observed than down-regulated targets after 4 hour 
E2 stimulation (Figure 1A). Both DBGA and NGA 
mechanisms have more targets than I-DBGA has. After 
24 hour E2 stimulation, a greater (p < 0.00001 vs. 4 
hour) number of down-regulated targets was observed 



for all three mechanisms (Figure IB &1C). These results 
are not totally consistent with the results in [8], as we 
use the 20% fold-change as an additional filtering criter- 
ion. Many significantly down-regulated genes have small 
fold change, especially after 4 hour E2 treatment. 

It is interesting to note that the number of DBGA and 
I-DBGA targets at 24 hour was approximately doubled 
compared to 4 hour, while an approximate 5-fold 
increase in the number of NGA targets was observed at 
24 hours (Figure 1A &1B). Furthermore, there was strik- 
ingly little overlap among the ERa targets between the 
two time points (8.5%, 5.8%, 3.8% for DBGA, I-DBGA, 
and NGA) respectively. 

Gene ontology enrichment analysis was performed for 
the genomic and non-genomic targets at 4 and 24 hour 
after E2 stimulation, and the top 5 functional categories 
are listed in Table 1 (p-value range for sub-functional 
categories is reported for each category). Although both 
genomic and non-genomic mechanisms share only a 
small number of targets, their functions are highly con- 
sistent. At both 4 and 24 hours, genomic targets are 



A. ERa Targets After 4 Hour E2 Stimulation 
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B. ERa Targets After 24 Hour E2 Stimulation 
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Figure 1 Statistics of ERa targets after E2 stimulation. (A) ERa targets after 4 hour E2 stimulation in MCF7 cells; (B) ERa targets after 24 hour 
E2 stimulation in MCF7 cells; (C) Comparisons of up/down-regulated targets within each of three ERa regulation mechanisms; and (D) ERa 
targets overlap between 4 and 24 hour after E2 stimulation. 
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Table 1 Gene Ontology Analysis of Estrogen Targets 
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mainly responsible for gene expression, cell morphology, 
cellular growth/development/movement, and cell cycle/ 
death. On the other hand, at both time points, non- 
genomic targets are attributed to RNA post-translational 
modification, DNA replication/re-combination/repair, 
amino acid metabolism, cellular assembly and organiza- 
tions. Therefore, genomic and non-genomic mechanisms 
have dramatically different impacts on the molecular 
and cellular functions in breast cancer cells. 

ERoc regulatory networks and their hubs 

After 4 hours of E2 stimulation, the ERa regulatory net- 
work is composed of an ERa hub and multiple inter- 
connected hubs (Figure 2A). Both ERa (DBGA) and 
Spl (I-DBGA) hubs are consistent with genomic 
mechanisms, while the other hubs follow non-genomic 
mechanisms. The target sizes of genomic and non-geno- 
mics hubs are approximately equal; however, after 24 
hour of E2 stimulation, there is a pronounced increase 
in the number of non-genomic hubs and targets com- 
pared to genomic hubs and targets (Figure 2B). These 
results demonstrate that while both genomic and non- 
genomic hubs are equally important, a greater number 
of late response E2 targets are activated through non- 
genomic mechanisms compared to genomic hubs. In 
addition, a striking feature of this dynamic ERa regula- 
tory network is that a consistent set of transcription fac- 
tors appear to control the hubs, despite the lack of 
overlap for hub targets between the two time points 
(discussed above; Figure ID). These factors include 
(ZFP161, TFDP1, NRF1, TFAP2A, EGR1, E2F1, PITX2). 
Further comparison of the significant hubs between the 
4 and 24 hour networks shows that both statistical sig- 
nificance (p-value) and hub size are consistent between 



two time points for both genomic and non-genomic 
hubs (Figure 3). 

Antagonistic/Agonistic effects of tamoxifen metabolites: 
4-OH tamoxifen and endoxifen 

Different SERMs have been shown to have different 
antagonistic/agonistic on E2 up- and down-regulated 
genes [18]. The effect of the tamoxifen metabolites 
OHT and endoxifen, both well-known SERMS [17], on 
ERa target networks has not been compared, particu- 
larly with regard to ERa genomic/non-genomic targets. 
Among the ERa targets identified after 24 hour of E2 
stimulation, 17% and 14% were responsive to OHT and 
endoxifen respectively, with 74% of the targets overlap- 
ping (additional file 3). The agonist, antagonist, and par- 
tial agonist/antagonist activity of OHT and endoxifen on 
the ERa targets at 24 hour post E2 stimulation were 
nearly identical for the two SERMS (41%, 7%, 52% and 
40%, 7%, 53% for OHT and endoxifen, respectively; 
additional file 4). 

We further classified the effects of OHT and endoxi- 
fen on ERa genomic/non-genomic and up/down regula- 
tion. There was a tendency for a greater agonistic effect 
on ERa genomic targets than non-genomic targets after 
E2 or OHT treatment (p = 0.01; Figure 4A). However, 
this difference in agonistic activity on genomic/non- 
genomic targets was not seen (p = 0.67, Figure 4B) after 
E2 or endoxifen treatment. 

Epigenetic modifications impact the ERa regulatory 
network in tamoxifen resistant MCF7 cells 

Breast cancer cell models for acquired resistance to 
tamoxifen display progressive loss of estrogen-dependent 
signaling for cell growth and proliferation and a 
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Figure 2 ERoc regulatory network after E2 stimulation. (A) ERa regulatory network after 4 hours E2 stimulation in MCF7 cells; and (B) ERa 
regulatory network after 24 hours E2 stimulation in MCF7 cells. 



disrupted ERa regulatory network [16]. Among the ERa In order to understand the role of epigenetics in this non- 
targets observed after 4 hour E2 stimulation of MCF7, responsive ERa network, we investigated five possible 
only one target remained hormone responsive in the mechanisms (additional file 5): (A) high basal gene expres- 
tamoxifen-resistant MCF7-T subline (NRF1; Figure 5). sion in the MCF7-T cell; (B) hypermethylation (MCF7-T 
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Figure 3 Regularory hubs in ERoc regulatory network. (A) The correlation of the significance of hubs between 4 hour and 24 networks; and 
(B) The correlation of the significance of non-genomic hubs between 4 hour and 24 networks. Both axis are the -log(p-value), and the width 
and length of the squares represent the relative scales of hubs. 



vs MCF7) (C) hypomethylation (MCF7-T vs MCF7); (D) 
high methylation level in MCF7-T; and (C) high H3K27/ 
H3K4 ratio. As shown in Figure 6, these mechanisms 
account for approximately 27%, 19%, 15%, 34%, and 22% 
of the non-responsive targets (Figure 6A); however, these 
five mechanisms are not able to account for approx. 28% 
of targets. Substantial (36%) overlap was seen between 
hypermethylation (mechanism 2) and high basal methyla- 
tion in MCF7-T cell (mechanism 4) (Figure 6B). 

Validation studies 

Pol II-Binding. We compared PolII binding signals in 
MCF7 before and after 4 hour E2 stimulation. Nearly all 
ERa genomic targets displayed the same direction in 



fold-change between PolII binding and gene expression 
signals (98%; additional file 6A). Among the non-geno- 
mic targets, this concordance rate dropped slightly 
(86%). On the other hand, the concordance rate among 
non-targets was 55%. 

H3K4 Dimethylation is a well established histone mar- 
ker for transcription activation acetylation marker. We 
selected the median of H3K4 dimethylation ChlP-seq 
signal as the threshold. Almost all ERa genomic targets 
displayed H3K4 dimethylation higher than the median 
(94%, additional file 6B). Among the non-genomic tar- 
gets, this concordance rate dropped slightly (84%). On 
the other hand, the concordance rate among non-targets 
was 49%. 
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(B) Endoxifen 
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Figure 4 Effect of selective ERa modulators. (A) The agonistic effect of 4-OH tamoxifen is greater on genomic mechanism than on 
antagonistic or partial effects (p = 0.01). (B) No evidence for agonistic, antagonistic, or partial effects of endoxifen on genomic or non-genomics 
mechanisms. 
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Figure 5 ERa regulatory network in drug-resistant cells. ERa regulatory network in MCF7 cell after 4 hour E2 stimulation becomes non- 
responsive to E2 in the MCF7-T cell (only one target gene remains responsive). 



Overlap of 4 hour and 24 hour Estrogen Targets in the 
MCF7 Cell We used a different data set by Cicatiello et 
al [19], in which MCF7 cells were treated with E2, and 
sampled at baseline, 4 hr and 24 hr. This experiment 
was performed on a different gene expression platform, 
Illunima. We applied a similar empirical Bayes model 
and the same fold change threshold. We obtained a 
similar percentage of up/down regulated genes after 4h/ 
24h estrogen treatment. In addition, the overlap of 4 
and 24 hour gene targets was, 7%, similar to what we 
found out with our data. 

RT-qPCR, ChlP-PCR, and COBRA. We further investi- 
gated four types of epigenetics mechanisms. 

♦ Mechanism 1: GAB2 and LAMB2 were non- 
responsive in our network due to significantly 
increased basal expression in MCF7-T vs. MCF7 
(based on microarray data). Although RT-qPCR ana- 
lysis confirmed that GAB2 and LAMB2 expression 
was significantly higher in MCF7-T vs. MCF7 (Fig- 
ure 7A,B), both genes were slightly responsive to E2 
in MCF7-T. Our interpretation is that Affymetrix 



technology can be saturated for highly expressed 
genes, becoming insensitive to subtle expression 
changes. Nonetheless, the non-responsive mechan- 
ism needs further experimental investigation. 

♦ Mechanism 5: PGR, PLS3, SPATA13, GREB1, and 
MAOA were non-responsive because of a high ratio 
of H3K27me3:H3K4me2 in MCF7-T vs. MCF7. 
Using ChlP-PCR, this mechanism was validated in 
four of five target genes (Figure 7C,D,F,G; exception 
was SPATA13, Figure 7E). 

♦ Mechanisms 2 and 4: the DNA methylation status 
four ERa targets (PGR, PLS3, CREB1, SPATA13) 
was examined. Using COBRA assays, increased 
DNA methylation was observed in PGR and PLS3 
in MCF7-T compared to MCF7 (Figure 7H; 
mechanism 4), and increased methylation in the 
MCF7-T and the MCF7 (mechanism 2). Further- 
more, in the non-responsive ERa network, both 
PGR and PLS3 displayed both repressive epigenetic 
modifcations, the altered histone methylation ratio 
(mechanism 5) and altered DNA methylation 
(mechanism 2 and 4). 
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A. Non-Response Mechanism 
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Figure 6 Epigenetic mechanisms in drug-resistant cells 

Epigenetic mechanisms in ERa regulatory network in MCF7-T cell: 1 
high basal gene expression in MCF7-T cells; 2 hypermethylation 
from MCF7 cells to MCF7-T cells; 3 hypomethylation from MCF7 
cells to MCF7-H cells; 4 high basal methylation level in the MCF-T 
cells; 5 high H3K27/H3K4 ratio; and 6 unknown mechanisms. (A) The 
distribution of non-responsive mechanisms in ERa regulatory 
network in MCF7-T cell. (B) The overlap among 5 non-responsive 
mechanisms. 



Discussion 

Advantage of the modulated empirical bayes method in 
assembling a TF regulatory network model 

Our proposed ERa regulatory network model frame- 
work differs from existing methods in its ability to dis- 
tinguish between genomic and non-genomic actions, 
and the assumption for functional TFs. The pioneer TF 
regulatory network for Saccharomyces cerevisiae, devel- 
oped by Luscombe et al. [11] and Lee et al. [20], 
emphasized that TFs themselves should be highly 
expressed and display differences in expression level. 
However, these assumptions tend to be overly stringent 
and not suitable for our data. Our gene expression 
microarray data suggested that the majority of the TFs 
(more than 70%) are expressed at low levels in MCF7 



cells, and E2 stimulation results primarily in changes in 
TF phosphorylation state and not robust changes in TF 
expression in breast cancer cell lines, including MCF7 
[7,16,21]. All of the TFs in our genomic and non-geno- 
mic hubs didn't change their expression significantly 
(additional file 7 and additional file 8). Stringent statisti- 
cal models have recently been developed to establish TF 
regulatory networks [12,13,15]. Such regression-based 
approaches were not significant when used to analyze 
our data (not even for ERa itself), mainly due to the 
fact that TFs, including ERa, have both up- and down- 
regulated targets. If targets that change in opposite 
directions are not treated differently, the regression 
model will cancel-out any effect of a TF on gene expres- 
sion. Therefore, regression model-based approaches to 
identify TF regulatory networks can be sensitive to a 
mis-specified model. 

Our proposed empirical Bayes method modulates FDR 
calculations from differential gene expression data, 
ChlP-chip binding peaks, and TF motif scans. The 
inferred ERa regulatory network model has the follow- 
ing features and advantages: 

♦ Distinct genomic and non-genomic mechanisms. 

♦ Less stringent requirements on TF gene expression 
levels. 

♦ Modulated data analysis leading to robust conclu- 
sions with respect to model misspecifications. 

♦ Modulated model assembly results in an extend- 
able TF network, which is particularly useful when 
additional data becomes available for new molecular 
mechanisms. 



ERa regulatory network and corresponding hubs 

When constructing genomic targets of the ERa regula- 
tory network, TFs are scanned within a narrow region, 
45bp, of ERa ChlP-chip binding sites. This calculation 
scheme enables the identification of either DBGA or 
indirect I-DBGA. In many previous studies [8,22-24], 
relatively large neighborhoods surrounding the ERa 
binding site (around 500~1000bp) were scanned for 
consensus sequences of TFBSs. While this is an effective 
strategy for identifying co-regulatory TFs, it is not an 
effective approach for inferences regarding DBGA or I- 
DBGA. For example, Lin et al. [23] demonstrated that 
EREs and ERE half-sites were enriched for other tran- 
scription factors motifs, supporting the notion that TFs, 
in addition to ERa, can bind to ERE. In our analysis, we 
identified only Sp-1 as an I-DBGA. Although API has 
been reported to be an I-DBGA, in our data it did not 
pass the false positive threshold (FDR = 0.23), due to its 
relatively short TFBS (6 bp). Binding motifs for forkhead 
TFs have also been reported to be enriched within ERa 
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Figure 7 RT-PCR, ChlP-PCR and COBRA Validations 



binding regions in MCF7 cells by ChlP-chip [8]. How- 
ever, in our study, there was not sufficient evidence to 
support FoxAl as an I-DBGA (FDR = 0.34), a result 
supported by recent studies using ChlP-seq and ChlP- 
DSL [25-27]. Recently, RAR and ERa binding were 
shown to be highly coincident throughout the genome, 
competing for binding to the same or similar response 
elements [28]. Our ERa regulatory network model, how- 
ever, is not able to identify RAR targets, as the ChlP- 
chip experiments were only performed for ERa binding 
sites and not RAR. 

In our analysis, non-genomic targets of the ERa regu- 
latory network were constructed using genes whose pro- 
moters, introns, or downstream sequences were devoid 
of ERa ChlP-chip binding sites. Significant TF scan 
scores of these gene promoters infer ERa non-genomic 
action (NGA). It is worth noting that these NGA differ 
from previously described ERa co-regulator factors. 
NGA does not require ERa binding, in contrast to ERa 
co-regulatory factors which must display ERa binding 
peaks in the ChlP-chip analysis. Significant NGA tran- 
scription factors include ZFP161, TFDP1, NRF1, 



TFAP2A, EGR1, E2F1, and PITX2 (p <0.01). Other sig- 
nificant NGA includes MYC, which has been previously 
reported [28], and although MYC was present in both 4 
and 24 hour ERa regulatory networks, the level of sig- 
nificance was not high enough to be considered a hub 
(p = 0.14). 

Among the nine hubs that are significantly enriched in 
both 4 hour and 24 hour ERa networks, two facilitate 
genomic activities (ERa and Spl), while the other seven 
hubs (ZFP161, TFDP1, NRF1, TFAP2A, EGR1, E2F1, 
PITX2) mediate non-genomic actions. With the excep- 
tion of (ZFP161, TFDP1, PITX2), the functions of (Spl, 
NRF1, E2F1, TFAP2A, EGR-1) and their functional rele- 
vance to estrogen action in breast cancer cells have 
been extensively documented in [29-32]. 

While the ERa regulatory network concept has 
recently been reviewed [33,34], our study is the first to 
characterize genomic and non-genomic mechanisms and 
their different functions. The genomic mechanism is sig- 
nificantly involved in cell proliferation and control of 
cell phases, confirming a significant effect of estrogen 
on cell cycle regulation. Biological processes significantly 
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affected by the non-genomic mechanism include RNA 
post-translation modification, cellular development, 
DNA replication, re-combination, and repair. Additional 
models describing network properties of estrogen signal- 
ing targets include the protein-protein interaction and 
the functional module networks [28]. The focus of the 
two networks is on the functional interpretation of the 
targets and not mechanism of regulation. Furthermore, 
the edges are interpreted as either protein interaction or 
functional similarity and are not directional, compared 
to the edges in our regulatory network, which have up 
or down-regulation direction. 

Antagonist/agonist effects of SERMs on ERa regulatory 
networks 

We observed full and partial antagonist/agonist effect of 
OHT on MCF7 after 24 hour E2 stimulation, similar to a 
previous study [18]. We further show that genomic and 
non-genomic actions of the ERa regulatory network are 
differentially influenced by full or partial antagonist/ago- 
nist activities of OHT and endoxifen. The current study 
clearly demonstrates that the E2 responsive ERa regula- 
tory network is disrupted by two SERMs (additional file 
4), but whether new networks are stimulated by these or 
other SERMs require additional investigation. 

Epigenetic Modifications of ERa Regulatory Network in 
the MCF7-T Cell 

A second application of the regulatory network was to 
examine the impact of epigenetics (DNA methylation 
and histone modifications) on the ERa regulatory net- 
work in a breast cancer cell model for acquired tamoxi- 
fen resistance of [16]. Transcriptionally active genes are 
typically marked by higher levels of di-/tri-methylated 
H3K4 (H3K4me2/3) and low trimethylated H3 lysine 27 
(H3K27me3) levels [35], and in hormone responsive 
MCF7 cells, E2-stimulated target genes have been 
shown to posses enriched regions of H3K4mel/2 [36]. 
In contrast, MCF7 with acquired tamoxifen resistance 
(MCF7-T), groups of previously E2-responsive genes are 
now associated with low H3K4me2 and high H3K27me3 
and are either downregulated or no longer strongly hor- 
mone inducible (Figure 8). The H3K27me3 mark is 
stable and invariably associated with transcriptional 
repression [37,38] and we show that this repressive his- 
tone modification plays a key role in the unresponsive 
ERa regulatory network in MCF7 cells with acquired 
resistance to tamoxifen (Figure 8). Although tumori- 
genic gene silencing mediated by H3K27me3 has been 
shown to occur in the absence of DNA methylation 
[38,39], repressive histone marks frequently coordinate 
with the more permanent mark of DNA methylation in 
heterochromatin [39-41]. We previously demonstrated 
that alterations in DNA methylation play an important 



role in acquired tamoxifen resistance [16]. By integrating 
both repressive epigenetic marks into our model, we 
demonstrate that H3K27me3 and DNA methylation sig- 
nificantly contribute to the non-responsive ERa regula- 
tory network model in tamoxifen resistant breast cancer. 
Furthermore, having recently demonstrated that many 
TFBSs are enriched in regions of altered DNA methyla- 
tion [42], we suggest that the functions of activators or 
repressors could be altered by changes to the DNA 
methylation landscape and further impact ERa networks 
in breast cancer, an active area of investigation in our 
laboratory. 

When we compare the percentages of different epige- 
netic mechanisms (Figure 7, 27%, 19%, 15%, 34%, 22%), 
to 20% each for a random gene set based on the 
selected thresholds, it seems that the non-responsive 
targets have similar distribution of various types of epi- 
genetic mechanisms as that of a random gene set. 
Therefore, it is possible that there may not exist specific 
patterns of epigenetic mechanisms in MCF7 cells' 
acquired tamoxifen resistance. 

Conclusions 

In breast cancer cells, we identified a number of estro- 
gen regulated target genes and the estrogen-regulated 
network that characterizes the causal relationships 
between transcription factors and their targets. This net- 
work has two major mechanisms, the genomic action 
and the non-genomic action. In genomic action, after 
estrogen receptor is activated by estrogen, estrogen 
receptor regulated genes through directing binding to 
DNA. In non-genomic action, estrogen regulated its 
gene targets through non-direct binding through other 
factors. In the estrogen regulated network, we found 
that though many non-genomic targets change over 
time, they do share many common factors and the con- 
sistency is highly significant. Moreover, we found that 
many gene targets of this network were not active any- 
more in anti-estrogen resistant cell lines, possibly 
because their DNA methylation and histone acetylation 
patterns have changed. Taken together, our model has 
revealed novel and unexpected features of estrogen- 
regulated transcriptional networks in hormone respon- 
sive and anti-estrogen resistant human breast cancer. 

Methods 

Chromatin immunoprecipitation and ChlP-Seq library 
generation 

Chromatin immunoprecipitation (ChIP) for Pol II (sc- 
899X, Santa Cruz, CA), H3K4me2 (Millipore, 07-030, 
Billerica, MA) and H3K27me3 (Diagenode, CS-069-100, 
Sparta, NJ) was performed as previously described [43]. 
ChIP libraries for sequencing were prepared following 
standard protocols from Illumina (San Diego, CA) as 
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described in [44]. ChlP-seq libraries were sequenced 
using the Illumina Genome Analyzer II (GA II) as per 
manufacturer's instructions. Sequencing was performed 
up to 36 cycles for mapping to the human genome 
reference sequence. Image analysis and base calling were 
performed with the standard Illumina pipeline, and with 
automated matrix and phasing calculations on the PhiX 



control that was run in the eighth lane of each flow-cell. 
Samples were run on duplicates. 

Methyl-CpG immunoprecipitation (MClp-seq) 

MCIp-seq was performed and followed the manufac- 
ture's protocol (MethylMiner, Invitrogen, Carlsbad, CA). 
Briefly, genomic DNA was sheared by sonication into 
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200-600-bp fragments, and methylated DNA was 
immuno-precipitated by incubating 1 [ig of sonicated 
genomic DNA for lh at room temperature with 3.5 \ig 
of recombinant MBD-biotin protein and Streptavidin 
beads. Methylated DNA was eluted with high-salt buf- 
fers (500 or 1,000 mmol/L NaCl), and then recovered by 
standard phenol chloroform procedure. The DNA frac- 
tions were subjected to library generation and followed 
by Illumina sequencing. Samples were run in duplicate. 

Quantitative ChlP-PCR 

To determine binding levels of H3K4me2 and 
H3K27me3 on target genes, quantitative ChlP-PCR was 
used to measure the amount of this sequence in anti- 
H3K4me2 or H3K27me3-immunoprecipitated samples 
by PCR with SYBR Green-based detection (Applied Bio- 
systems). Experimental quantitative ChlP-PCR values 
were normalized against values obtained by a standard 
curve (10-fold dilution, R 2 >0.99) constructed by input 
DNA with the same primer set. Specific primers for 
amplification are available upon request. 

Reverse transcription and quantitative PCR (RT-qPCR) 

Total RNA (1 |ig) was reverse transcribed with the 
Superscript III reverse transcriptase (Invitrogen, Carls- 
bad, CA). PCR was performed as described previously 
[45]. Specific primers for amplification are available 
upon request. The relative cellular expression of a cod- 
ing gene was determined by comparing the threshold 
cycle (Ct) of the gene against the Ct of GAPDH. 

Identification of differentially expressed genes and FDR 
calculation 

An empirical Bayes approach in the mixture-model fra- 
mework was developed to assess differential gene 
expression data from Affymetrix platform. Because the 
differential expression inference is made at the gene 
level rather than at the probe level, our model is an 
extension of Kendziorski's work [46,47]. In this model, 
between-gene variation, between-probe variation and 
between replicate are included. Specifically, let i index 
genes (i = 1.2.,...,/), 1 index conditions/groups/time (/ = 
1,2; 1 is the reference), / index probe set (J = 1,2,..., n t ) 
and k index replicate (k = 1,2,..., m,*). Let G t j k be the 
expression level of the /<th replicate on probe ; for gene 
i under group /. We consider the following random- 
effects model: 

Gijkl = Mi/ + + £ijkl 

bij ~ua MCof), Sijkl ~ imUm N(0,<$|) 

where (An is the gene expression level for gene i under 
condition l,bij represents the probe effect for the ;th 
probe of gene i and e^/ * s the error term (for genes 



with only one probe, the probe effect b is eliminated 
from model (1)). We consider that the genes come from 
three latent populations, each of which is characterized 
by the location of ^ (X variable) and [i i2 (Y variable) on 
a two-dimensional plane. The first population, a bivari- 
ate normal distribution with the center located above 
the y = x line, represents up-regulated genes. The sec- 
ond population, a normal distribution along y = x line, 
represents unchanged genes. The third population, a 
bivariate normal distribution with the center below the 
y = x line, characterizes down-regulated genes. Denote 
by Y { a latent indicator such that Y t = 1,0,-1 implies that 
gene i belongs to the first, second and third populations, 
respectively. Thus, we consider the following model for 

h =BN(^ i ; / / 1/ 5: 1 ) / 
/_i =BN(^ i ; V _ 1/ Z_i) / 

Pr[Y, = 1] = Pl ; Pr[Y, = -1] = Pr[Y, = 0] = p 0 , 

rin > rjn, £12 < %n, Pi + P-i + Po = 1, 

where /(.) is a function that takes value 1 if the argu- 
ment is logical/true and 0 if otherwise; BN and N 
denote the bivariate and univariate normal distributions, 
respectively. By integrating equations (1) and (2), one 
can use the Expectation-Maximization (EM) algorithm 
(Sl.doc) to estimate the parameter vector 9 = (p, T|i, H 1} 
T|_i, £_!,A,0,o,5). The posterior probability Pr[Y f = 0|G,£] 
can be interpreted as the probability that gene i is not 
differentiated. Rigorously speaking, Pr[Yf = ±1|G, 0] can- 
not be directly interpreted as the probability that gene i 
is up/downregulated. However, a probability close to 1 
indicates a good approximation. In our analysis, we 
claim that a gene is up-regulated if Pr[Y; = 1|G, 0] > c 
and /Xf2 — Ail > 0 or downregulated if 
Pr[Y f = -1\G,0] >c and Ai2 - An < 0. The local FDR 
can be easily estimated by 1 — Pr[Yi = 1|G, 0] or 
1 - Pr[Y f = -1|G,0][48]. In our analysis, we set c = 0.80. 
Models (1) and (2) are fitted to baseline and E2 stimu- 
lated (4 and 24 hours) expression data for MCF7 cells. 
In addition to FDR, we also set 20% fold-change in 
either up- or down-regulation in expression as the bio- 
logically significant effect size. Binding Scores for Peak 
Areas Identified by ChlP-chip and FDR Calculation is 
based on model-based analysis of tiling-arrays [49] . 

Motif binding site scan and FDR calculation 

Genomic Binding Sites: Each significant ChlP-chip peak 
binding site sequence of length 45 bp (25 bp of tiling 
array probes plus 10 bp up/downstream of each probe) 
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is scanned by all of the TF motifs in TRANSFAC data- 
bases. The range of binding scores for a transcription 
factor with motif M are divided into a number of small 
bins (k = 200). The number of scores fall into each bin 
is then calculated. If the number of any bin is lower 
than a pre-specified limit (t = m b 20), the bin is col- 
lapsed with neighboring bins until the number is beyond 
the limit. The number of scores that fall in each bin is 
denoted b by m b . Then, we randomly generate R = 
10,000 sequences based on human genome background 
using a 6 th order Markov model. This model assumes 
that a sequence element probability depends on 6 pre- 
vious bases, immediately preceding the current base 
[50]. The binding scores for these random sequences 
are calculated, and the number of scores that falls into 
each bin is denoted by n b . Finally, the local FDR, in 
terms of binding event for scores in bin b, is calculated 
as 



FDR bM = 



n h /R 
nihil' 



(3) 



where / is the total number of genes. In doing so, we 
force the bins below the midpoint of the score range to 
have FDR b>m = 1 because it is highly unlikely that these 
low score bins represent true binding events. Finally, we 
fit a cubic smoothing-spline to FDR b>m to get FDR s>m , 
the local FDR at score s (degree = 4, # of knots = # of 
unique FDR bm values). Then for each gene, we have the 
FDR estimate respect to the event that TF g binds to 
gene i's promoter. This non-parametric approach to 
estimate FDR was first described by Efron et al. [51] in 
differential gene expression data analysis. 

Non- genomic Binding Sites: We applied the same 
method as above to the motif binding scores collected 
from each gene promoter upstream 1Kb. 

Modulated empirical bayes model: DBGA, l-DBGA, and 
NGA mechanism determination based on ChlP-chip peak, 
TF motif scan and differential gene expression data 

Based on FDRs calculated from empirical Bayes models 
in differential gene expression, ChlP-chip binding peaks, 
and TF motif scan scores, DBGA, I-DBGA, and NGA 
targets were calculated using the flow- chart displayed in 
Figure 8. Graphical interpretations of different mechan- 
isms and their associated data types are displayed in Fig- 
ures SI and S2. In brief, both genomic and non-genomic 
targets must have significantly differentially expressed 
genes, while only genomic targets have significant ChlP- 
chip binding peaks. Finally, a DBGA has a significant 
ERa motif in the ChlP-chip binding sites, an I-DBGA 
has one or more significant TF motifs (other than ERa) 
in the ChlP-chip binding sites, and a NGA has one or 
more significant TF motifs in its target gene promoter. 



TF Hub significance calculation 

To quantify the significance of well-connected TF hubs, 
we consider the following null hypothesis: TFs that are 
involved in the regulation of differential genes are ran- 
domly picked from a pool of known TFs. Specifically, we 
suppose there are M differential genes. For each gene i, 
there are bi binding sites by ChlP-chip and motif search 
that pass the threshold, which involve n t {n t < b t ) unique 

M 

TFs. Therefore, there are a total of N = ^ xi{ involved 

i=i 

TFs. If there are n known TFs, then under the null 
hypothesis the number of connected nodes for each TF is 
the same as the number of times each TF appear from M 
random draws with each draw of size n t . Note that each 
draw of n t is without replacement because they represent 
distinct transcription factors. The distribution of the 
number of connected nodes ( T ) for any TF is 

e (n(- '))«-" 

Pr(T = t) = ^ , \ —> W 

n " 

i=i \ rii 

where Cl(t) is the set of all subsets of {1,2,...,M} with t 
elements. Hence, p-values associated with hub TFs can 
be obtained by calculating Pr(T > t obs ), where t obs is the 
observed number of genes regulated by the TF of inter- 
est. This calculation is programmed in R. 

Signal identification for ChlP-seq (Polll, H3K4me2, 
H3K27me3) and MClp-seq 

In order to evaluate transcriptional activity, activating 
and repressive histone methylation marks, and DNA 
methylation of ERa target genes, ChlP-seq data for 
RNA Pol II, H3K4me2, and H3K27me3 and MIRA-seq 
data DNA methylation were analyzed. Total sequences 
were normalized among replicates. For the ChlP-seq 
data, the signal intensity was measured as the number 
of ChlP-seq tags within the promoter region, defined as 
1,000-bp upstream of TSS (transcription start site). In 
the MClp-seq data, seq tags within upstream lOOObp 
and downstream lOOObp of the TSS were selected for 
promoter DNA- methylation. 

Identifications of agonist, antagonist, and partial agonist/ 
antagonist selective estrogen receptor modulator (SERM) 
targets 

Let (FC E2 , FC SE RM> FC E2 +serm) be the fold-change of 
gene expression after treatment of MCF7 cells with E2, 
SERM (OHT or endoxifen), or E2+SERM). We defined 
fold-change as gene expression in the treatment group 
over the control group for up-regulation; otherwise, it is 
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defined as the minus inverse ratio. In particular, if a 
gene is absent in both groups, the fold-change is defined 
as 1. A SERM has an agonistic effect on a gene if | 
FC S erm\ > [1 + "70% x (|FC£ 2 | - 1)L an antagonistic 
effect if \FC SE rm\ < [1 + 35% x (\FC E2 \ - 1)] and \FC E2 
+serm | < [1 + 50% x (|FCe 2 | - 1)]; otherwise, it has a 
partial agonistic/antagonistic effect. These agonist and 
antagonist activities have been defined previously [18]. 

Epigenetic mechanisms of non-responsive ERa network in 
4-hydroxy tamoxifen (OHT) resistant MCF7 cells 

For ERa targets in the ERa regulatory network 4 hours 
after E2 stimulation, five different epigenetic mechan- 
isms were investigated (additional file 5). 

♦ The first mechanism (additional file 5A) is the 
high-basal gene expression in the 4-OHT-resistant 
MCF7 cells, in which the threshold of high-basal 
gene expression is defined as its 80 th percentile. 

♦ The second mechanism (additional file 5B) is 
defined as the hyper-methylation: i.e., higher methy- 
lation level of OHT-resistant MCF7 than the paren- 
tal (hormone-responsive) MCF7. The threshold of 
this fold-change is defined as its 80 th percentile. 

♦ The third mechanism (additional file 5C) is defined 
as the hypo-methylation: i.e., lower methylation level 
of OHT-resistant MCF7 vs. MCF7. The threshold of 
this fold-change is defined as its 80 th percentile. 

♦ The fourth mechanism (additional file 5D) is 
defined as the high methylation in the OHT-resis- 
tant MCF7. The threshold of methylation level is 
defined as its 80 th percentile. 

♦ The fifth mechanism (additional file 5E) is defined 
as the high H3K27/K3K4 ratio, a gene repressive 
mark, in the OHT-resistant MCF7. The threshold of 
this ratio level is defined as its 80 th percentile. 

All other non-responsive ERa targets were categorized 
as "unknown". 

Additional material 



to MCF7-T cells; (C) hypomethylation from MCF7 cells to MCF7-H cells; 
(D) high basal methylation level in the MCF-T cells; (E) high H3K27/H3K4 
ratio. 

Additional file 6: is a jpeg file, indicating the concordance between 
differential Polll bindings and differential gene expression among 
genomic-targets, non-genomic targets, and none targets; and the 
concordance between H3K4 dimethylation among genomic-targets, 
non-genomic targets, and none targets. (A) The concordance of 
differential gene expression and Polll binding are before and after E2 
stimulation of MCF7 cells. (B) The concordance of differential gene 
expression and H3K4 dimethylation. 

Additional file 7: Supplementary Table 1 

Additional file 8: Supplementary Table 2 



Abbreviations 

ERa: estrogen receptor a; TF: transcription factor; E2: 1 7-estradiol; MCF7: 
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Additional file 1: is a jpeg file, indicating the situations of ligand- 
dependent genomic target, ligand-independent genomic target and 
non-genomic target 

Additional file 2: is a jpeg file, indicating the relationships between 
data and ERa mechanisms 

Additional file 3: is a jpeg file, indicating the effect of 40H- 
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endoxifen 

Additional file 5: is a jpeg file, indicating non-responsive 
mechanisms in ERa regulatory network in MCF7-T cell. (A) high basal 
gene expression in MCF7-T cells; (B) hypermethylation from MCF7 cells 
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